Improving Association Rule Mining By Defining A Novel Data Structure

Author

  • Vinayak Suresh Shukla
Abstract

1 Student, PES’s Modern College of Engineering, Pune 5. 2 HOD, Computer Engineering Dept., PES’s Modern College of Engineering, Pune 5.

In recent years, digital data storage has grown rapidly owing to the ease of use and lower cost of digital storage media. This data is high-dimensional and heterogeneous in nature, which hampers the process of knowledge discovery. This process is referred to here as association rule mining (ARM). Although many association rule mining algorithms have been proposed in recent years to deal with large volumes of data, the mining process underperforms when the data set contains a very large number of records. Hence the aim of this work is not to design a new mining algorithm but to design a new data structure that stores the data reliably. The original data is simplified and reorganized, and its access time is reduced, to improve efficiency in terms of time and main memory requirements. Lower main memory requirements and faster data access are achieved by means of Shuffling, Inverted Index Mapping and Run Length Encoding. The resulting data structure can therefore be used with existing association rule mining algorithms to speed up mining and reduce main memory requirements without changing the original algorithms. This is further improved by replacing Run Length Encoding with a Modified Run Length Encoding algorithm for better memory utilization and more efficient mining.
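The abstract names Inverted Index Mapping and Run Length Encoding but does not say how they are combined, so the following is only a minimal sketch under stated assumptions: each item maps to a 0/1 presence list over the transactions (the inverted index), and each list is stored as (value, run length) pairs. The function names, the toy transactions, and the decode-then-AND support count are all hypothetical illustrations, not the author's data structure. The Shuffling step and the Modified Run Length Encoding variant are not modeled because the abstract gives no details for them.

```python
from itertools import groupby

def build_inverted_index(transactions, items):
    """Map each item to a 0/1 presence list, one entry per transaction."""
    return {item: [1 if item in t else 0 for t in transactions] for item in items}

def rle_encode(bits):
    """Run-length encode a sequence as (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(bits)]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out

def support(runs):
    """Support count of one item: total length of its runs of 1s."""
    return sum(length for value, length in runs if value == 1)

def pair_support(runs_a, runs_b):
    """Support count of an item pair (simple version: decode, then AND)."""
    return sum(x & y for x, y in zip(rle_decode(runs_a), rle_decode(runs_b)))

# Toy data, invented for illustration only.
transactions = [{"a", "b"}, {"a"}, {"a", "c"}, {"b", "c"}]
items = ["a", "b", "c"]

index = build_inverted_index(transactions, items)
encoded = {item: rle_encode(bits) for item, bits in index.items()}

print(encoded["a"])                              # [(1, 3), (0, 1)]
print(support(encoded["a"]))                     # 3
print(pair_support(encoded["a"], encoded["c"]))  # 1
```

In this sketch the single-item support is read directly from the compressed runs, which is the kind of saving a mining algorithm such as Apriori could exploit without being modified. Run-length encoding only pays off when the presence lists contain long runs, which is presumably what a reordering step would aim for.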

Similar resources

A Novel Method for Selecting the Supplier Based on Association Rule Mining

One of the important problems in supply chain management is supplier selection. A company holds massive amounts of data from various departments, so extracting knowledge from that data is very complicated. Many researchers have addressed this problem with methods such as fuzzy set theory, goal programming, multi-objective programming, linear programming, mixed integer programming, analyti...

Data sanitization in association rule mining based on impact factor

Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...

New Approaches to Analyze Gasoline Rationing

In this paper, the relation among factors in the road transportation sector from March 2005 to March 2011 is analyzed. Most previous studies take an economic point of view on gasoline consumption. Here, a new approach is proposed in which different data mining techniques are used to extract meaningful relations between the aforementioned factors. The main and dependent factor is gasolin...

Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining

Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page is one type of web data; it is regarded as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...

A Frame Work for Frequent Pattern Mining Using Dynamic Function

Discovering frequent objects (item sets, sequential patterns) is one of the most vital fields in data mining. It is well understood that generating candidates requires considerable running time and memory, and this is the motivation for developing a large number of algorithms. Frequent pattern mining is a much-studied research issue in association rule analysis. Apriori algorithm is a standard algor...

Publication date: 2017